
5 research outputs found

Reducing cache hierarchy energy consumption by predicting forwarding and disabling associative sets

Authors: Apollini Ruben; Carazo Pablo; Castro Rodríguez Fernando; Chaver Martínez Daniel Ángel; Piñuel Moreno Luis; Tirado Fernández Francisco
Publication venue: 'World Scientific Pub Co Pte Lt'
Publication date: 01/01/2012

The first-level data cache in modern processors has become a major consumer of energy due to its increasing size and high access frequency. To reduce this energy consumption, we propose in this paper a straightforward filtering technique based on a highly accurate forwarding predictor. Specifically, a simple structure predicts whether a load instruction will obtain its data via forwarding from the load-store structure, thus avoiding the data cache access, or whether the data will be provided by the data cache. This mechanism reduces data cache energy consumption by an average of 21.5% with a negligible performance penalty of less than 0.1%. Furthermore, in this paper we also address static cache energy consumption by disabling a portion of the sets of the associative L2 cache. Overall, when both proposals are combined, the total L1 and L2 energy consumption is reduced by an average of 29.2% with a performance penalty of just 0.25%.

Docta Complutense

LAReferencia - Red Federada de Repositorios Institucionales de Publicaciones Científicas Latinoamericanas

Archivo Digital UPM
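
A minimal sketch of the forwarding-prediction idea summarized in the abstract above. The abstract only says that "a simple structure" predicts whether a load will be served by forwarding; the table of 2-bit saturating counters indexed by the load's program counter, the class name `ForwardingPredictor`, the function `simulate` and the trace format are all assumptions made for illustration, not the authors' design.

```python
class ForwardingPredictor:
    """Hypothetical predictor: a PC-indexed table of 2-bit saturating counters."""

    def __init__(self, entries=1024):
        # entries must be a power of two so the index can be taken with a bit mask
        assert entries > 0 and entries & (entries - 1) == 0
        self.mask = entries - 1
        # counter values 0..1 predict "data cache", 2..3 predict "forwarding"
        self.table = [0] * entries

    def _index(self, pc):
        # drop the two low bits, assuming 4-byte aligned instructions
        return (pc >> 2) & self.mask

    def predict_forwarding(self, pc):
        return self.table[self._index(pc)] >= 2

    def update(self, pc, was_forwarded):
        i = self._index(pc)
        if was_forwarded:
            self.table[i] = min(3, self.table[i] + 1)
        else:
            self.table[i] = max(0, self.table[i] - 1)


def simulate(trace, predictor):
    """Replay a trace of (pc, was_forwarded) pairs and count the outcomes."""
    stats = {"cache_accesses_skipped": 0, "mispredicted_forward": 0, "loads": 0}
    for pc, was_forwarded in trace:
        stats["loads"] += 1
        if predictor.predict_forwarding(pc):
            if was_forwarded:
                # correct prediction: the data cache access is filtered out
                stats["cache_accesses_skipped"] += 1
            else:
                # wrong prediction: the cache has to be accessed after all
                stats["mispredicted_forward"] += 1
        predictor.update(pc, was_forwarded)
    return stats
```

In this sketch a load that is always forwarded needs two training updates before its cache accesses start being skipped, while a load that is never forwarded keeps going to the cache. The energy and performance figures quoted in the abstract come from the paper's own evaluation and cannot be reproduced with this toy model.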

Wavelet transform for large scale image processing on modern microprocessors

Authors: Chaver Martínez Daniel Ángel; Piñuel Moreno Luis; Tenllado van der Reijden Christian; Tirado Fernández Francisco
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2003

In this paper we discuss several issues relevant to the vectorization of a 2-D Discrete Wavelet Transform on current microprocessors. Our research builds on previous studies of the efficient exploitation of the memory hierarchy, given its tremendous impact on performance. We have extended this work with a more detailed analysis based on hardware performance counters and with a study of vectorization; in particular, we have used the Intel Pentium SSE instruction set. Most of our optimizations are performed at source code level to allow automatic vectorization, though some compiler intrinsic functions have been introduced to enhance performance. Taking into account the level of abstraction at which the optimizations are performed, the results obtained on an Intel Pentium III microprocessor are quite satisfactory, even though further improvement can be obtained by a more extensive use of compiler intrinsics.

Docta Complutense

2-D wavelet transform enhancement on general-purpose microprocessors: memory hierarchy and SIMD parallelism exploitation

Authors: Chaver Martínez Daniel Ángel; Piñuel Moreno Luis; Prieto Matías Manuel; Tenllado van der Reijden Christian; Tirado Fernández Francisco
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 01/01/2002

This paper addresses the implementation of a 2-D Discrete Wavelet Transform on general-purpose microprocessors, focusing on both memory hierarchy and SIMD parallelization issues. The two topics are related, since SIMD extensions are only useful if the memory hierarchy is efficiently exploited. In this work, locality has been significantly improved by means of a novel approach called pipelined computation, which complements previous techniques based on loop tiling and non-linear layouts. As experimental platforms we have employed a Pentium-III (P-III) and a Pentium-4 (P-4) microprocessor. However, our SIMD-oriented tuning has been performed exclusively at source code level. Basically, we have reordered some loops and introduced some modifications that allow automatic vectorization. Taking into account the abstraction level at which the optimizations are carried out, the speedups obtained on the investigated platforms are quite satisfactory, even though further improvement can be obtained by lowering the level of abstraction (compiler intrinsics or assembly code).

Docta Complutense
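
A rough sketch of the loop-tiling idea mentioned in the abstract above, applied to one level of a 2-D wavelet transform. The abstract does not name the wavelet filter, so the Haar filter used here is an assumption, as are the function names and the `tile` parameter. The sketch does not reproduce the paper's pipelined computation, non-linear layouts or SIMD tuning; it only shows a column pass reorganized so that the innermost loop walks along rows, block of columns by block of columns.

```python
def haar_rows(img):
    """One Haar level along each row: averages on the left, differences on the right."""
    out = []
    for row in img:
        half = len(row) // 2
        avg = [(row[2 * j] + row[2 * j + 1]) / 2 for j in range(half)]
        dif = [(row[2 * j] - row[2 * j + 1]) / 2 for j in range(half)]
        out.append(avg + dif)
    return out


def haar_cols_tiled(img, tile=4):
    """One Haar level along each column, visiting columns in blocks of `tile`."""
    rows, cols = len(img), len(img[0])
    half = rows // 2
    out = [[0.0] * cols for _ in range(rows)]
    for c0 in range(0, cols, tile):       # one block of adjacent columns at a time
        c1 = min(c0 + tile, cols)
        for i in range(half):
            top, bot = img[2 * i], img[2 * i + 1]
            for c in range(c0, c1):       # innermost loop moves along a row
                out[i][c] = (top[c] + bot[c]) / 2
                out[half + i][c] = (top[c] - bot[c]) / 2
    return out


def haar_2d(img, tile=4):
    """Single-level 2-D transform: row pass followed by the tiled column pass.

    Assumes a rectangular image with an even number of rows and columns.
    """
    return haar_cols_tiled(haar_rows(img), tile)
```

The tile size changes only the order in which elements are visited, not the result. In a compiled language with row-major arrays, an inner loop over adjacent elements of a row is also the kind of loop a compiler can usually vectorize automatically, which is the connection between locality and SIMD that the abstract draws.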

Funcionamiento de la herramienta OpenIRS-UCM y sus sinergias con Moodle [Operation of the OpenIRS-UCM tool and its synergies with Moodle]

Authors: Castro Rodríguez Fernando; Chaver Martínez Daniel Ángel; García Sánchez Carlos; Gómez Pérez José Ignacio; López Orozco José Antonio; Piñuel Moreno Luis; Tenllado van der Reijden Christian
Publication venue: 'Universidad Complutense de Madrid (UCM)'
Publication date: 01/01/2012

Interactive response systems have been gaining acceptance within the educational community in recent years, and clear evidence of this is the growing number of commercial systems available on the market today. However, most of these solutions are closed, rigid, and dependent on the software installed on the instructor's computer. In this work we present a new free tool, which we have named OpenIRS-UCM, that incorporates most of the functionality of commercial applications, with the advantage of integrating several types of commercial clickers with other devices such as smartphones, PDAs, laptops, etc. In addition, it can interact with the Moodle virtual campus platform, exponentially increasing its possible uses.

Docta Complutense